0. Why quantum computing?

Information processing (computing) is the dynamical evolution of a highly orga- 
nized physical system produced by technology (computer) or nature (brain). The 
initial state of this system is (determined by) its input; its final state is the output. 
Physics describes nature in two complementary modes: classical and quantum. Up 
to the nineties, the basic mathematical models of computing, Turing machines, 
were classical objects, although the first suggestions for studying quantum models 
date back at least to 1980. 

Roughly speaking, the motivation to study quantum computing comes from 
several sources: physics and technology, cognitive science, and mathematics. We 
will briefly discuss them in turn. 

(i) Physically, the quantum mode of description is more fundamental than the 
classical one. In the seventies and eighties it was remarked that, because of the 
superposition principle, it is computationally unfeasible to simulate quantum processes
on classical computers ([Po], [Fe1]). Roughly speaking, quantizing a classical
system with N states we obtain a quantum system whose state space is an (N - 1)-dimensional
complex projective space whose volume grows exponentially with N.
One can argue that the main preoccupation of quantum chemistry is the struggle 
with resulting difficulties. Reversing this argument, one might expect that quan- 
tum computers, if they can be built at all, will be considerably more powerful than 
classical ones ([Fe1], [Ma2]).

Progress in the microfabrication techniques of modern computers has already led 
us to the level where quantum noise becomes an essential hindrance to the error- 
free functioning of microchips. It is only logical to start exploiting the essential 
quantum mechanical behavior of small objects in devising computers, instead of 
neutralizing it. 

(ii) As another motivation, one can invoke highly speculative, but intriguing, 
conjectures that our brain is in fact a quantum computer. For example, the recent 
progress in writing efficient chess playing software (Deep Blue) shows that to sim- 
ulate the world championship level using only classical algorithms, one has to be 
able to analyze about 10^6 positions/sec and use about 10^10 memory bytes. Since
the characteristic time of neuronal processing is about 10^-3 sec, it is very difficult
to explain how the classical brain could possibly do the job and play chess as successfully
as Kasparov does. A less spectacular, but not less resource consuming
task, is speech generation and perception, which is routinely done by billions of hu- 
man brains, but still presents a formidable challenge for modern computers using 
classical algorithms. 

Computational complexity of cognitive tasks has several sources: basic variables 
can be fields; a restricted amount of small blocks can combine into exponentially 
growing trees of alternatives; databases of incompressible information have to be 
stored and searched. 

Two paradigms have been developed to cope with these difficulties: logic-like 
languages and combinatorial algorithms, and statistical matching of observed data 
to an unobserved model (see D. Mumford's paper [Mu] for a lucid discussion of the 
second paradigm.) 

In many cases, the second strategy efficiently supports an acceptable performance,
but usually cannot achieve excellence at the Deep Blue level. Both paradigms
require huge computational resources, and it is not clear how they can be organized
unless hardware allows massively parallel computing.

The idea of "quantum parallelism" (see sec. 2 below) is an appealing theoretical 
alternative. However, it is not at all clear that it can be made compatible with 
the available experimental evidence, which depicts the central nervous system as a 
distinctly classical device. 

The following way out might be worth exploring. The implementation of effi- 
cient quantum algorithms which have been studied so far can be provided by one, 
or several, quantum chips (registers) controlled by a classical computer. A very 
considerable part of the overall computing job, besides controlling quantum chips, 
is also assigned to the classical computer. Analyzing a physical device of such 
architecture, we would have direct access to its classical component (electrical or 
neuronal network), whereas locating its quantum components might constitute a 
considerable challenge. For example, quantum chips in the brain might be rep- 
resented by macromolecules of the type that were considered in some theoretical 
models for high temperature superconductivity. 

The difficulties are seemingly increased by the fact that quantum measurements 
produce non-deterministic outcomes. Actually, one could try to use this to one's 
advantage, because there exist situations where we can distinguish the quantum 
randomness from the classical one by analyzing the probability distributions and 
using the Bell-type inequalities. With hindsight, one recognizes in Bell's setup 
the first example of the game-like situation where quantum players can behave 
demonstrably more efficiently than the classical ones (cf. the description of this
setup in [Ts], pp. 52-54). 

It would be extremely interesting to devise an experimental setting purporting 
to show that some fragments of the central nervous system relevant for information 
processing can in fact be in a quantum superposition of classical states.

(iii) Finally, we turn to mathematics. One can argue that nowadays one does 
not even need additional motivation, given the predominant mood prescribing the 
quantization of "everything that moves". Quantum groups, quantum cohomology,
quantum invariants of knots, etc. come to mind. This actually seemed to be the
primary motivation before 1994, when P. Shor ([Sh]) devised the first quantum 
algorithm showing that prime factorization can be done on quantum computers in 
polynomial time, that is, considerably faster than by any known classical algorithm. 
(P. Shor's work was inspired by the earlier work [Si] of D. Simon). Shor's paper 
gave a new boost to the subject. Another beautiful result due to L. Grover ([Gro]) 
is that a quantum search among N objects can be done in $c\sqrt{N}$ steps. A. Kitaev
[Ki1] devised new quantum algorithms for computing stabilizers of abelian group
actions; his work was preceded by that of D. Boneh and R. Lipton [BoL], who 
treated the more general problem by a modification of Shor's method (cf. also 
[Gri]). At least as important as the results themselves, are the tools invented by 
Shor, Grover, and Kitaev. 

Shor's work is the central subject of this lecture. It is explained in sec. 4. This 
explanation follows the discussion of the general principles of quantum computing 
and massive quantum parallelism in sec. 2, and of four quantum subroutines, 
including Grover's searching algorithm, in sec. 3. The second of these subroutines 
involving quantum computations of classical computable functions shows how to 
cope with the basic issue of quantum reversibility vs classical irreversibility. For 
more on this, see [Ben1] and [Ben2]. The opening sec. 1 contains a brief report on
the classical theory of computability. I made some effort to express certain notions 
of computer science, including P/NP, in the language of mainstream mathematics. 
The last section 5 discusses Kolmogorov complexity in the context of classical and 
quantum computations. 

Last, but not least, the hardware for quantum computing does not exist as yet: 
see 3.3 below for a brief discussion of the first attempts to engineer it. The quantum 
algorithms invented and studied up to now will stimulate the search for a technological
implementation which, if successful, will certainly correct our present understanding
of quantum computing and quantum complexity.

Acknowledgements. I am grateful to Alesha Kitaev, David Mumford, and Dimitri 
Manin for their interest and remarks on the earlier version of this report. Many of 
their suggestions are incorporated in the text. 

1. Classical theory of computation 

1.1. Constructive universe. In this section I deal only with deterministic 
computations, which can be modelled by classical discrete time dynamical systems 
and subsequently quantized. 

Alan Turing undertook the microscopic analysis of the intuitive idea of algorithmic
computation. In a sense, he found its genetic code. The atom of information
is one bit, the atomic operators can be chosen to act upon one/two bits and to
produce the outputs of the same small size. Finally, the sequence of operations is 
strictly determined by the local environment of bounded size, again several bits. 

For a change, I proceed in the reverse direction, and start this section with a 
presentation of the macrocosm of the classical theory of computation. Categorical 
language is appropriate to this end. 

Let C be a category whose objects are countable or finite sets U. Elements x of 
these sets will generally be finite sets with additional structure. Without waiting 
for all the necessary axioms to be introduced, we will call $x \in U$ a constructive
object of type U (an integer, a finite graph, a word in a given alphabet, a Boolean 
expression, an instance of a mass problem . . . ) The set U itself will be called the 
constructive world of objects of fixed type, and C the constructive universe. The 
category C, which will be made more concrete below, will contain all finite products 
and finite unions of its objects, and also finite sets U of all cardinalities. 

Morphisms $U \to V$ in C are certain partial maps of the underlying sets. More
precisely, such a morphism is a pair (D(f), f) where $D(f) \subset U$ and $f : D(f) \to V$
is a set-theoretic map. Composition is defined by

$(D(g), g) \circ (D(f), f) = (f^{-1}(D(g)), g \circ f).$

We will omit D(f) when it does not lead to a confusion. 
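
As a minimal sketch of composition of partial maps, one may model a morphism as a domain test together with a function. The class name `PartialMap` and the two sample maps are illustrative assumptions, and the domain is represented here by a decidable predicate, which is a simplification (in general D(f) is only enumerable). The composite is defined at u exactly when f is defined at u and g is defined at f(u).

```python
class PartialMap:
    """A pair (D(f), f): a domain test together with a map defined on that domain."""

    def __init__(self, domain, func):
        self.domain = domain  # predicate u -> bool, standing in for D(f)
        self.func = func      # the underlying set-theoretic map

    def defined_at(self, u):
        return self.domain(u)

    def __call__(self, u):
        if not self.domain(u):
            raise ValueError(f"{u!r} is outside the domain")
        return self.func(u)


def compose(g, f):
    """Return g o f; it is defined at u iff u is in D(f) and f(u) is in D(g)."""
    return PartialMap(
        lambda u: f.defined_at(u) and g.defined_at(f(u)),
        lambda u: g(f(u)),
    )


# Two hypothetical partial maps on the positive integers.
halve = PartialMap(lambda n: n % 2 == 0, lambda n: n // 2)  # defined on even n
pred = PartialMap(lambda n: n >= 2, lambda n: n - 1)        # defined on n >= 2

h = compose(pred, halve)  # defined on even n >= 4
```

Here `h` is defined at 6 but not at 2, because `halve(2) = 1` lies outside the domain of `pred`.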

The morphisms f that we will be considering are (semi-)computable functions
$U \to V$. An intuitive meaning of this notion, which has a very strong heuristic
potential, can be explained as follows: there should exist an algorithm $\varphi$ such that
if one takes as input the constructive object $u \in U$, one of the three alternatives
holds:

(i) $u \in D(f)$, $\varphi$ produces in a finite number of steps the output $f(u) \in V$.

(ii) $u \notin D(f)$, $\varphi$ produces in a finite number of steps the standard output meaning
NO.

(iii) $u \notin D(f)$, $\varphi$ works for an infinitely long time without producing any output.

The necessity of including the alternative (iii) in the definition of (semi-)computa- 
bility was an important and non-trivial discovery of the classical theory. The set 
of all morphisms $U \to V$ is denoted C(U,V).

The sets of the form $D(f) \subset U$ are called enumerable subsets of U. If both $E \subset U$
and $U \setminus E$ are enumerable, E is called decidable.
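
The passage from "E and its complement are both enumerable" to "E is decidable" can be sketched as follows, assuming each enumerable set is given by a generator that eventually lists every one of its elements. The two sample enumerators (squares and non-squares) and the function name `decide` are illustrative choices only.

```python
from itertools import count
from math import isqrt


def enumerate_squares():
    """List the elements of E = {1, 4, 9, ...} one by one."""
    for k in count(1):
        yield k * k


def enumerate_non_squares():
    """List the elements of the complement of E in the positive integers."""
    for n in count(1):
        if isqrt(n) ** 2 != n:
            yield n


def decide(u, enum_in, enum_out):
    """Alternate between the two enumerations until u shows up in one of them."""
    inside, outside = enum_in(), enum_out()
    while True:
        if next(inside) == u:
            return True
        if next(outside) == u:
            return False
```

The loop always halts because u occurs in exactly one of the two lists. With only the first enumerator available, the search would run forever on inputs outside E, which is alternative (iii) above.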

The classical computation theory makes all of this more precise in the following 
way. 

1.2. Definition. A category C as above is called a constructive universe if
it contains the constructive world N of all integers $\ge 1$, finite sets $\emptyset$, {1}, ...,
{1, ..., n}, ... and satisfies the following conditions (a)-(d).

(a) C(N,N) is defined as the set of all partially recursive functions (see e.g. 
[Ma1], Chapter V, or [Sa]).

(b) Any infinite object of C is isomorphic to N. 

(c) If U is finite, C(U,V) consists of all partial maps $U \to V$. If V is finite,
C(U, V) consists of those f for which the inverse image of any element of V is enumerable.

Before stating the last condition (d), we make some comments. 

Statement (b) is a part of the famous Church Thesis. Any isomorphism (computable
bijection) $\mathbf{N} \to U$ in C is called a numbering. Thus, two different numberings
of the same constructive world differ by a recursive permutation of N. We will
call such numberings equivalent ones. Notice that because of (c) two finite construc- 
tive worlds are isomorphic iff they have the same cardinality, and the automorphism 
group of any finite U consists of all permutations of U. 

As a matter of principle, we always consider C as an open category, and at any 
moment allow ourselves to add to it new constructive worlds. If some infinite V is 
added to C, it must come together with a class of equivalent numberings. Thus, 
any finite union of constructive worlds can be naturally turned into the constructive 
world, so that the embeddings become computable morphisms, and their images 
are decidable. As another example, the world N* of finite sequences of numbers 
from N ("words in alphabet N") is endowed with Gödel's numbering

$(n_1, n_2, \dots, n_k, \dots) \mapsto 2^{n_1 - 1} 3^{n_2 - 1} \cdots p_k^{n_k - 1} \cdots \qquad (1)$

where $p_k$ is the k-th prime number. Hence we may assume that C is closed with
respect to the construction $U \mapsto U^*$. All natural functions, such as length of the
word $U^* \to \mathbf{N}$, or the i-th letter of the word $U^* \to U$ are computable.
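
A small sketch of the prime-power encoding may help. The function names are hypothetical, primes are produced by naive trial division, and letters are positive integers as in the text. Since a letter equal to 1 contributes the exponent 0, trailing 1's cannot be recovered from the code; the decoder below returns the word with its trailing 1's stripped.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demonstration)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def godel_encode(word):
    """Map (n_1, ..., n_k) to the product of p_i ** (n_i - 1)."""
    code = 1
    for n_i, p_i in zip(word, primes()):
        if n_i < 1:
            raise ValueError("letters must be integers >= 1")
        code *= p_i ** (n_i - 1)
    return code


def godel_decode(code):
    """Recover the letters up to the last one that differs from 1."""
    word = []
    gen = primes()
    while code > 1:
        p = next(gen)
        exponent = 0
        while code % p == 0:
            code //= p
            exponent += 1
        word.append(exponent + 1)
    return word
```

For instance the word (3, 1, 2) is sent to 2^2 * 3^0 * 5^1 = 20. The size of the code grows exponentially with the letters, which is why this numbering is convenient for computability but not for questions of polynomial time.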

Similarly, C can be made closed with respect to the finite direct products by 
using the (inverse) numbering of $\mathbf{N}^2$:

$(m, n) \mapsto m + \frac{1}{2}(m + n - 1)(m + n - 2). \qquad (2)$

Projections, diagonal maps, fiber maps $V \to U \times V$, $v \mapsto (u_0, v)$ are all computable.
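
The pairing (2) and its inverse can be sketched directly. The function names are assumptions; the inverse locates the diagonal m + n by an integer square root.

```python
from math import isqrt


def pair(m, n):
    """The map (2): (m, n) -> m + (m + n - 1)(m + n - 2) / 2, for m, n >= 1."""
    s = m + n
    return m + (s - 1) * (s - 2) // 2


def unpair(z):
    """Inverse of pair: recover (m, n) from z >= 1."""
    # t = m + n - 2 is the largest integer with t(t + 1)/2 <= z - 1.
    t = (isqrt(8 * (z - 1) + 1) - 1) // 2
    m = z - t * (t + 1) // 2
    n = t + 2 - m
    return m, n
```

Both directions use a bounded number of arithmetic operations on numbers of comparable bit size, in contrast with the prime-power numbering (1).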

Decidable subsets of constructive worlds are again constructive. 

Church Thesis is often invoked as a substitute for an explicit construction of a 
numbering, and it says that the category C is defined uniquely up to equivalence. 

We now turn to the computability properties of the sets of morphisms C(U, V). 
Again, it is a matter of principle that C(U, V) itself is not a constructive world if 
U is infinite. To describe the situation axiomatically, consider first any diagram 



$\mathrm{ev} : P \times U \to V \qquad (3)$

in C. It defines a partial map $P \to C(U, V)$, $p \mapsto \bar{p}$, where $\bar{p}(u) := \mathrm{ev}(p, u)$. We will
say that the constructive world P = P(U, V) together with the evaluation map ev is
a programming method (for computing some maps $U \to V$). It is called universal
if the following two conditions are satisfied. First, the map $P \to C(U, V)$ must
be surjective. Second, for any programming method Q = Q(U,V) with the same
source U and target V, C(Q, P) contains translation morphisms

$\mathrm{trans} : Q(U, V) \to P(U, V) \qquad (4)$

which are, by definition, everywhere defined, computable maps $Q \to P$ such that if
$q \mapsto p$, then $\bar{q} = \bar{p}$.

We now complete the Definition 1.2 by adding the last axiom forming part of 
the Church Thesis: 

(d) For every two constructive worlds U,V, there exist universal programming 
methods. 

The standard examples of P for U = V = N are (formalized descriptions of) 
Turing machines, or recursive functions. 

From (d) it follows that the composition of morphisms can be lifted to a com- 
putable function on the level of programming methods. To be more precise, if Q 
(resp. P) is a programming method for U, V (resp. V, W), and R is a universal 
programming method for U, W, there exist computable composition maps 

$\mathrm{comp} : P(V, W) \times Q(U, V) \to R(U, W), \quad (p, q) \mapsto r \qquad (5)$

such that $\bar{r} = \bar{p} \circ \bar{q}$.

Concrete P(U, V) are furnished by the choice of what is called the "model of 
computations" in computer science. This last notion comes with a detailed de- 
scription not only of programs but also of all steps of the computational process. 
At this stage the models of kinematics and dynamics of the process first emerge, 
and the discussion of quantization can start. 

A formalized description of the first n steps will be called a history of computation 
or, for short, a protocol (of length n). For a fixed model, protocols (of all lengths)
form a constructive world as well. We will give two formalized versions of this 
notion, for functions with infinite and finite domains respectively. The first will be 
well suited for the discussion of polynomial time computability, the second is the 
base for quantum computing. 

1.3. Models of computations I: normal models. Let U be an infinite 
constructive world. In this subsection we will be considering partial functions $U \to U$.
The more general case $U \to V$ can be reduced to this one by working with
$U \sqcup V$.

A normal model of computations is the structure (P, U, I, F, s) consisting of four
sets and a map:

$I \subset U, \quad F \subset P \times U, \quad s : P \times U \to P \times U. \qquad (6)$

Here s is an everywhere defined function such that $s(p, u) = (p, s_p(u))$ for any
$(p, u) \in P \times U$. Intuitively, p is a program, u is a configuration of the deterministic
discrete time computing device, and $s_p(u)$ is the new configuration obtained from
u after one unit of time (clock tick). Two additional subsets $I \subset U$ (initial configurations,
or inputs) and $F \subset P \times U$ (final configurations) must be given, such that
if $(p, u) \in F$, then $s(p, u) = (p, u)$, i.e. u is a fixed point of $s_p$.

In this setting, we denote by $f_p$ the partial function $f_p : I \to U$ such that we
have

$u \in D(f_p)$ and $f_p(u) = v$ iff for some $n \ge 0$, $(p, s_p^n(u)) \in F$ and $s_p^n(u) = v. \qquad (7)$

The minimal such n will be called the time (number of clock ticks) needed to
calculate $f_p(u)$ using the program p.

Any finite sequence 

$(p, u, s_p(u), \dots, s_p^m(u)), \quad u \in I, \qquad (8)$

will be called a protocol of computation of length m. 
We now add the constructivity conditions. 

We require P, U to be constructive worlds, s computable. In addition, we assume
that I, F are decidable subsets of U, $P \times U$ respectively. Then $f_p$ are computable,
and protocols of given length (resp. of arbitrary length, resp. those stopping at
F) form constructive worlds. If we denote by Q the world of protocols stopping
at F and by $\mathrm{ev} : Q \times U \to U$ the map $(p, u) \mapsto s_p^{\max}(u)$, we get a programming
method.
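
To make the structure (P, U, I, F, s) concrete, here is a toy normal model written as a sketch. It is an assumed example, not one taken from the text: programs are tuples of counter-machine instructions, configurations are pairs (program counter, registers), inputs are configurations with counter 0, and final configurations are those whose counter points past the end of the program. The cutoff `max_ticks` is an artifact of the demonstration, since the true $f_p$ is only a partial function.

```python
def step(p, u):
    """One clock tick s_p. Final configurations are fixed points."""
    pc, regs = u
    if pc >= len(p):                 # (p, u) lies in F: nothing changes
        return u
    op, r, target = p[pc]
    regs = list(regs)
    if op == "inc":                  # increment register r
        regs[r] += 1
        return (pc + 1, tuple(regs))
    if op == "decjz":                # jump if register r is 0, else decrement it
        if regs[r] == 0:
            return (target, tuple(regs))
        regs[r] -= 1
        return (pc + 1, tuple(regs))
    if op == "jmp":                  # unconditional jump
        return (target, tuple(regs))
    raise ValueError(f"unknown instruction {op!r}")


def is_final(p, u):
    return u[0] >= len(p)


def run(p, u, max_ticks=10_000):
    """Return the protocol (u, s_p(u), ..., final configuration) and the time."""
    protocol = [u]
    while not is_final(p, protocol[-1]):
        if len(protocol) > max_ticks:
            raise RuntimeError("no final configuration reached within the cutoff")
        protocol.append(step(p, protocol[-1]))
    return protocol, len(protocol) - 1


# Hypothetical program: add register 1 to register 0.
ADD = (("decjz", 1, 3), ("inc", 0, None), ("jmp", None, 0))
protocol, ticks = run(ADD, (0, (2, 3)))
```

The last entry of `protocol` is the value $f_p(u)$ and `ticks` is the time in the sense defined above; each row of the protocol differs from the previous one only locally, which is the feature exploited later for Boolean circuits.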

Such a model is called universal, if the respective programming method is uni- 
versal. 

The notion of normal model of computations generalizes both normal algorithms 
and Turing machines. For their common treatment see e.g. [Sa], Chapter 4. In 
broad terms, $p \in P$ is the list of Markov substitutions, or the table defining the
operation of a Turing machine. The remaining worlds U, I, F consist of various
words over the working alphabet. 

1.3.1. Claim. For any U, universal normal models of computations exist, and
can be effectively constructed. 

For U = N, this follows from the existence of universal Turing machines, and 
generally, from the Church Thesis. It is well known that the universal machine for 
calculating functions of k arguments is obtained by taking an appropriate function
of k + 1 arguments and making the first argument the variable part of the program.
Hence P, in this case, consists of pairs (q,m), where q is the fixed program of the
(k + 1)-variable universal function (hardware) and m is a word written on the tape
(software).

1.4. Models of computations II: Boolean circuits. Boolean circuits are 
classical models of computation well suited for studying maps between the finite 
sets whose elements are encoded by sequences of 0's and 1's.

Consider the Boolean algebra B generated over $\mathbf{F}_2$ by a countable sequence of independent
variables, say $x_1, x_2, x_3, \dots$ This is the quotient algebra of $\mathbf{F}_2[x_1, x_2, \dots]$
with respect to the relations $x_i^2 = x_i$. Each Boolean polynomial determines a function
on $\oplus_{i=1}^{\infty} \mathbf{F}_2$ with values in $\mathbf{F}_2 = \{0, 1\}$.

We start with the following simple fact. 

1.4.1. Claim. Any map $f : \mathbf{F}_2^m \to \mathbf{F}_2^n$ can be represented by a unique vector of
Boolean polynomials.

Proof. It suffices to consider the case n = 1. Then f is represented by

$F(x_1, \dots, x_m) := \sum_{y = (y_i) \in \mathbf{F}_2^m} f(y) \prod_i (x_i + y_i + 1) \qquad (9)$

because the product in (9) is the delta function in x supported by y. Moreover, the
spaces of maps and of Boolean polynomials have the common dimension $2^m$ over
$\mathbf{F}_2$.
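
Formula (9) can be checked mechanically. In the sketch below, which rests on an assumed representation, a Boolean polynomial is a set of monomials and a monomial is a frozenset of variable indices (0-based here, whereas the text counts from 1). Addition over $\mathbf{F}_2$ is symmetric difference, and the relation $x_i^2 = x_i$ makes the product of two monomials their union.

```python
from itertools import product

ONE = frozenset()  # the empty monomial, i.e. the constant 1


def poly_add(p, q):
    """Sum over F_2: monomials occurring twice cancel."""
    return p ^ q


def poly_mul(p, q):
    """Product modulo x_i^2 = x_i."""
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}
    return frozenset(out)


def delta(y):
    """The product of (x_i + y_i + 1): equal to 1 at x = y and 0 elsewhere."""
    result = frozenset({ONE})
    for i, y_i in enumerate(y):
        factor = {frozenset({i})}     # the monomial x_i
        if y_i == 0:
            factor ^= {ONE}           # x_i + 0 + 1 = x_i + 1
        result = poly_mul(result, frozenset(factor))
    return result


def boolean_polynomial(f, m):
    """Sum of f(y) * delta(y) over all y in F_2^m, as in (9)."""
    total = frozenset()
    for y in product((0, 1), repeat=m):
        if f(y):
            total = poly_add(total, delta(y))
    return total


def evaluate(poly, x):
    """Value of the polynomial at the point x."""
    return sum(all(x[i] for i in mono) for mono in poly) % 2


majority = lambda y: int(sum(y) >= 2)
maj_poly = boolean_polynomial(majority, 3)
```

For the majority function of three bits this produces the three monomials $x_0 x_1$, $x_0 x_2$, $x_1 x_2$, in agreement with the uniqueness statement of the claim.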

Now we can calculate any vector of Boolean polynomials iterating operations 
from a small finite list, which is chosen and fixed, e.g. B := {x, 1, x + y, xy, (x, x)}. 
Such operators are called classical gates. A sequence of such operators, together 
with indication of their arguments from the previously computed bits, is called a 
Boolean circuit. The number of steps in such a circuit is considered as (a measure 
of) the time of computation. 
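
A Boolean circuit over the basis B can be sketched as a list of steps, each naming a gate and the positions of its arguments among the bits computed so far. The gate names and the tuple encoding are assumptions of this illustration; in particular every computed bit stays available for later steps, so the copying gate (x, x) is never strictly needed here.

```python
# The basis {x, 1, x + y, xy, (x, x)}; each gate returns a tuple of output bits.
GATES = {
    "id": lambda x: (x,),
    "one": lambda: (1,),
    "add": lambda x, y: ((x + y) % 2,),
    "mul": lambda x, y: (x * y,),
    "copy": lambda x: (x, x),
}


def run_circuit(circuit, inputs, outputs):
    """Apply the gates in order; return the selected output bits and the time."""
    bits = list(inputs)
    for name, args in circuit:
        bits.extend(GATES[name](*(bits[i] for i in args)))
    return tuple(bits[i] for i in outputs), len(circuit)


# x OR y computed as x + y + xy in three steps.
OR_CIRCUIT = [("add", (0, 1)), ("mul", (0, 1)), ("add", (2, 3))]
```

Calling `run_circuit(OR_CIRCUIT, (x, y), outputs=(4,))` returns the value of x OR y together with the number of steps, 3, which is the time of computation in the sense just defined.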

When the relevant finite sets are not $\mathbf{F}_2^m$ and perhaps have a wrong cardinality,
we encode their elements by finite sequences of bits and consider the restriction of 
the Boolean polynomial to the relevant subset. 

As above, a protocol of computation in this model can be represented as the 
finite table consisting of rows (generally of variable length) which accommodate 
sequences of 0's and 1's. The initial line of the table is the input. Each subsequent
line must be obtainable from the previous one by the application of one of the basic
functions in B to the sequence of neighboring bits (the remaining bits are copied 
unchanged). The last line is the output. The exact location of the bits which are 
changed in each row and the nature of change must be a part of the protocol. 

Physically, one can implement the rows as the different registers of the memory, 
or else as the consecutive states of the same register (then we have to make a 
prescription for how to cope with the variable length, e.g. using blank symbols). 

1.4.2. Turing machines vs Boolean circuits. Any protocol of the Turing
computation of a function can be treated as such a protocol of an appropriate 
Boolean circuit, and in this case we have only one register (the initial part of the 
tape) whose states are consecutively changed by the head/processor. We will still 
use the term "gate" in this context. 

A computable function f with infinite domain is the limit of a sequence of functions
$f_i$ between finite sets whose graphs extend each other. A Turing program
for f furnishes a computable sequence of Boolean circuits, which compute all $f_i$ in
turn. Such a sequence is sometimes called uniform.

1.5. Size, complexity, and polynomial time computability. The quan- 
titative theory of computational models deals simultaneously with the space and 
time dimensions of protocols. The preceding subsection focused on time, here we 
introduce space. For Boolean (and Turing machine) protocols this is easy: the 
length of each row of the protocol is the space required at that moment (plus sev- 
eral more bits for specifying the next gate). The maximum of these lengths is the 
total space required. 

The case of normal models and infinite constructive worlds is more interesting. 

Generally we will call a size function $U \to \mathbf{N} : u \mapsto |u|$ any function such that for
every $B \in \mathbf{N}$, there are only finitely many objects with $|u| \le B$. Thus the number
of bits $|n| = [\log_2 n] + 1$ and the identity function $\|n\| = n$ are both size functions.
Using a numbering, we can transfer them to any constructive world. In these two
examples, the number of constructive objects of size $\le H$ grows as $\exp cH$, resp.
$cH$. Such a count in more general cases allows one to make a distinction between
the bit size, measuring the length of a description of the object, and the volume of 
the object. 
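
The two growth rates can be observed on N itself. In this sketch the function names and the finite search range are assumptions made for the demonstration; Python's `int.bit_length` coincides with $[\log_2 n] + 1$ for $n \ge 1$.

```python
def bit_size(n):
    """|n| = [log_2 n] + 1, the number of binary digits of n >= 1."""
    return n.bit_length()


def volume(n):
    """||n|| = n."""
    return n


def count_up_to(size, H, search_limit):
    """Count the n in 1..search_limit whose size is at most H."""
    return sum(1 for n in range(1, search_limit + 1) if size(n) <= H)
```

With H = 8 and a search range of 1000, the bit size admits 2^8 - 1 = 255 numbers while the volume admits only 8, which is the exponential versus linear count mentioned above.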

In most cases we require computability of size functions. However, there are 
exceptions: for example, Kolmogorov complexity is a non-computable size function 
with very important properties: see below and sec. 5. 

Given a size function (on all relevant worlds) and a normal model of computations 
S, we can consider the following complexity problems.

(A) For a given morphism (computable map) $f : U \to V$, estimate the smallest
size $K_S(f)$ of the program p such that $f = f_p$.

Kolmogorov, Solomonoff and Chaitin proved that there exists an optimal uni- 
versal model of computations U such that, with P = N and the bit size function, 
for any other model S there exists a constant c such that for any f

$K_U(f) \le K_S(f) + c.$

When U is chosen, $K_U(f)$ is called Kolmogorov's complexity of f. With a different
choice of U we will get the same complexity function up to an O(1)-summand.

This complexity measure is highly non-trivial (and especially interesting) for
a one-element world U and infinite V. It then measures the size of the most
compressed description of a variable constructive object in V. This complexity is 
quite "objective" being almost independent of any arbitrary choices. Being un- 
computable, it cannot be directly used in computer science. However, it furnishes 
some basic restrictions on various complexity measures, somewhat similar to those 
provided by the conservation laws in physics. 

On N we have $K_U(n) \le |n| + O(1) = \log_2 \|n\| + O(1)$. The first inequality
"generically" can be replaced by equality, but infinitely often $K_U(n)$ becomes much
smaller than $|n|$.

(B) For a given morphism (recursive map) $f : U \to V$, estimate the time needed
to calculate $f(u)$, $u \in D(f)$, using the program p and compare the results for different
p and different models of computations.

(C) The same for the function "maximal size of intermediate configurations in 
the protocol of the computation of f(u) using the program p" (space, or memory). 

In the last two problems, we have to compare functions rather than numbers: 
time and space depend on the size of input. Here a cruder polynomial scale appears 
naturally. Let us show how this happens. 

Fix a computational model S with the transition function s computing functions
$U \to U$, and choose a bit size function on U satisfying the following crucial
assumption:

(•) $|u| - c \le |s_p(u)| \le |u| + c$ where the constant c may depend on p but not on u.

In this case we have $|s_p^m(u)| \le |u| + c_p m$: the required space grows no more than
linearly with time.

Let now (S', s') be another model such that $s_p = s'_q$ for some q. For example,
such q always exists if S' is universal. Assume that s' satisfies (•) as well, and 
additionally 

(••) s can be computed in the model S' in time bounded by a polynomial F in 
the size of input. 

This requirement is certainly satisfied for Turing and Markov models, and is 
generally reasonable, because an elementary step of an algorithm deserves its name 
only if it is computationally tractable. 

Then we can replace one application of $s_p$ to $s_p^m(u)$ by $\le F(|u| + cm)$ applications
of $s'_q$. And if we needed T(u) steps in order to calculate $f_p(u)$ using S, we will need
no more than $\sum_{m=1}^{T(u)} F(|u| + cm)$ steps to calculate the same function using S'
and q. In a detailed model, there might be a small additional cost of merging two 
protocols. This is an example of the translation morphism (4) lifted to the worlds 
of protocols. 

Thus, from (•) and (••) it follows that functions computable in polynomial time
by S have the same property for all reasonable models. Notice also that for such
functions, $|f(u)| \le G(|u|)$ for some polynomial G and that the domain D(f) of
such a function is decidable: if after $T(|u|)$ $s_p$-steps we are not in a final state, then
$u \notin D(f)$.

Thus we can define the class PF of functions with values in N computable in
polynomial time by using a fixed universal Turing machine and arguing as above
that this definition is model-independent.

If we want to extend it to a constructive universe C however, we will have to 
postulate additionally that any constructive world U comes together with a natural 
class of numberings which, together with their inverses, are computable in polyno- 
mial time. This seems to be a part of the content of the "polynomial Church thesis" 
invoked by M. Freedman in [Fr1]. If we take this strengthening of the Church thesis
for granted, then we can define also the bit size of an arbitrary constructive object 
as the bit size of its number with respect to one of these numberings. The quotient 
of two such size functions is bounded from above and from zero. 

Below we will be considering only the universes C and worlds U with these properties,
and $|u|$ will always denote one of the bit size norms. Gödel's numbering (2)
for $\mathbf{N} \times \mathbf{N}$ shows that such C is still closed with respect to finite products. (Notice
however that the beautiful numbering (1) of $\mathbf{N}^*$ using primes is not polynomial
time computable; it may be replaced by another one which is in PF).

1.6. P/NP problem. By definition, a subset $E \subset U$ belongs to the class P
iff its characteristic function $\chi_E$ (equal to 1 on E and 0 outside) belongs to the
class PF. Furthermore, $E \subset U$ belongs to the class NP iff there exists a subset
$E' \subset U \times V$ belonging to P and a polynomial G such that

$u \in E \iff \exists\, (u, v) \in E' \text{ with } |v| \le G(|u|).$

Here V is another world (which may coincide with U). We will say that E is
obtained from E' by a polynomially truncated projection.

The discussion above establishes in what sense this definition is model indepen- 
dent. 

Clearly, $P \subset NP$. The inverse inclusion is highly problematic. A naive algorithm
calculating $\chi_E$ from $\chi_{E'}$ by searching for v with $|v| \le G(|u|)$ and $\chi_{E'}(u, v) = 1$
will take exponential time, e.g. when there is no such v (because $|u|$ is a bit size
function). Of course, if one can treat all such v in parallel, the required time will be
polynomial. Or else, if an oracle tells you that $u \in E$ and supplies an appropriate v,
you can convince yourself that this is indeed so in polynomial time, by computing
$\chi_{E'}(u, v) = 1$.

Notice that the enumerable sets can be alternatively described as projections of
decidable ones, and that in this context projection does create undecidable sets.
Nobody was able to translate the diagonalization argument used to establish this
to the P/NP domain. M. Freedman ([Fr2]) suggested an exciting new approach
to the problem $P \ne NP$(?), based upon a modification of Gromov's strategy for
describing groups of polynomial growth.

It has long been known that this problem can be reduced to checking whether
some very particular sets, the NP-complete ones, belong to P. The set $E \subset U$ is
called NP-complete if, for any other set $D \subset V$, $D \in NP$, there exists a function
$f : V \to U$, $f \in PF$, such that $D = f^{-1}(E)$, that is, $\chi_D(v) = \chi_E(f(v))$. We will
sketch the classical argument (due to S. Cook, L. Levin, R. Karp) showing the
existence of NP-complete sets. In fact, the reasoning is constructive: it furnishes
a polynomially computable map producing f from the descriptions of $\chi_{E'}$ and of
the truncating polynomial G.

In order to describe one NP-complete problem, we will define an infinite family
of Boolean polynomials $b_u$ indexed by the following data, constituting objects u of
the constructive world U. One u is a collection

$m \in \mathbf{N}; \quad (S_1, T_1), \dots, (S_N, T_N), \qquad (10)$

where $S_i, T_i \subset \{1, \dots, m\}$, and $b_u$ is defined as

$b_u(x_1, \dots, x_m) = \prod_{i=1}^{N} \Big( 1 + \prod_{k \in S_i} (1 + x_k) \prod_{j \in T_i} x_j \Big). \qquad (11)$

The size of (10) is by definition $|u| = mN$.
Put

$E = \{ u \in U \mid \exists v \in \mathbf{F}_2^m, \ b_u(v) = 1 \}.$

Using the language of Boolean truth values, one says that v satisfies $b_u$ if $b_u(v) = 1$,
and E is called the satisfiability problem, or SAT.

1.6.1. Claim. $E \in NP$.

In fact, let

$E' = \{ (u, v) \mid b_u(v) = 1 \} \subset U \times (\oplus_{i=1}^{\infty} \mathbf{F}_2). \qquad (12)$

Clearly, E is the full projection of E'. A contemplation will convince the reader that
$E' \in P$. In fact, we can calculate $b_u(v)$ performing O(Nm) Boolean multiplications
and additions. The projection to E can be replaced by a polynomially truncated
projection, because we have to check only v of size $|v| \le m$.
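
The contrast between checking a candidate v and searching for one can be sketched as follows. The encoding of u as a list of pairs of Python sets and the function names are assumptions; variables are numbered from 1 as in (10). The evaluation of $b_u(v)$ uses at most O(Nm) multiplications and additions, whereas the naive search walks through all $2^m$ vectors v.

```python
from itertools import product


def b_u(clauses, v):
    """Evaluate (11) over F_2. clauses = [(S_1, T_1), ...]; v[k - 1] is the bit x_k."""
    value = 1
    for S, T in clauses:
        term = 1
        for k in S:
            term = term * (1 + v[k - 1]) % 2   # factor (1 + x_k)
        for j in T:
            term = term * v[j - 1] % 2         # factor x_j
        value = value * (1 + term) % 2
    return value


def naive_search(m, clauses):
    """Try all 2^m vectors; return a satisfying one, or None if there is none."""
    for v in product((0, 1), repeat=m):
        if b_u(clauses, v) == 1:
            return v
    return None


# (x_1 or x_2) and (not x_1): S_i collects plain variables, T_i negated ones.
SATISFIABLE = [({1, 2}, set()), (set(), {1})]
# x_1 and (not x_1).
UNSATISFIABLE = [({1}, set()), (set(), {1})]
```

A factor of (11) vanishes exactly when every variable indexed by $S_i$ is 0 and every variable indexed by $T_i$ is 1, so $S_i$ plays the role of the positive literals of a clause and $T_i$ that of the negated ones. On an unsatisfiable input the search exhausts all $2^m$ candidates, the exponential worst case mentioned earlier.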

1.6.2. Claim. E is NP-complete. 

In fact, let $D \in NP$, $D \subset A$ where A is some universe. Take a representation of
D as a polynomially truncated projection of some set $D' \subset A \times B$, $D' \in P$. Choose
a normal, say Turing, model of computation and consider the Turing protocols of
computation of $\chi_{D'}(a, b)$ with fixed a and variable polynomially bounded b. As we
have explained above, for a given a, any such protocol can be imagined as a table
of a fixed polynomially bounded size whose rows are the consecutive states of the
computation. In the "microscopic" description, the positions in this table can be
filled only by 0 or 1. In addition, each row is supplied with the specification of the
position and the inner state of the head/processor. Some of the arrangements are
valid protocols, others are not, but the local nature of the Turing computation
allows one to produce a Boolean polynomial $b_u$ in appropriate variables such that
the valid protocols are recognized by the fact that this polynomial takes value 1. For
detailed explanations see e.g. [GaJ], sec. 2.6. This defines the function f reducing
D to E. The construction is so direct that the polynomial time computability of f
is straightforward.

Many natural problems are known to be NP-complete, in particular 3-SAT. It 
is defined as the subset of SAT consisting of those $u$ for which $\mathrm{card}\,(S_i \cup T_i) = 3$ 
for all $i$. 

1.6.3. Remark. Most Boolean functions are not computable in polynomial 
time. Several versions of this statement can be proved by simple counting. 

First of all, fix a finite basis $B$ of Boolean operations as in 1.4.1, each acting 
upon $\le a$ bits. Then sequences of these operations of length $t$ generate $O((bn^a)^t)$ 
Boolean functions $\mathbf{F}_2^n \to \mathbf{F}_2^n$, where $b = \mathrm{card}\, B$. On the other hand, the number 
$2^{n2^n}$ of all functions grows as a double exponential of $n$ and for large $n$ cannot be 
obtained in time $t$ polynomially bounded in $n$. 

The same conclusion holds if we consider not all functions but only permutations: 
Stirling's formula for $\mathrm{card}\, S_{2^n} = 2^n!$ involves a double exponential. 

Here is one more variation of this problem: define the time complexity of a 
conjugacy class in $S_{2^n}$ as the minimal number of steps needed to calculate some 
permutation in this class. This notion arises if we are interested in calculating 
automorphisms of a finite universe of cardinality $2^n$, which is not supplied with a 
specific encoding by binary words. Then it can happen that a judicious choice of 
encoding will drastically simplify the calculation of a given function. However, for 
most functions we still will not be able to achieve polynomial time computability, 
because the asymptotic formula for the number of conjugacy classes (partitions) 

$$p(2^n) \sim \frac{\exp\bigl( \pi \sqrt{2^{n+1}/3} \bigr)}{4 \sqrt{3} \cdot 2^n} \tag{13}$$

again displays the double exponential growth. 

2. Quantum parallelism 

In this section we will discuss the basics: how to use the superposition principle 
in order to accelerate (certain) classical computations. 

2.1. Description of the problem. Let $N$ be a large number, 
$F : \{0, \dots, N-1\} \to \{0, \dots, N-1\}$ a function such that the computation of each particular value 
$F(x)$ is tractable, that is, can be done in time polynomial in $\log x$. We want to 
compute (to recognize) some property of the graph $(x, F(x))$, for example: 

(i) Find the least period $r$ of $F$, i.e. the least residue $r \bmod N$ such that 
$F(x + r \bmod N) = F(x)$ for all $x$ (the key step in the Factorization Problem). 

(ii) Find some $x$ such that $F(x) = 1$ or establish that such $x$ does not exist 
(Search Problem). 

As we already mentioned, the direct attack on such a problem consists in com- 
piling the complete list of pairs (x,F(x)) and then applying to it an algorithm 
recognizing the property in question. Such a strategy requires at least exponential 
time (as a function of the bit size of N) since already the length of the list is N. 
Barring a theoretical breakthrough in understanding such problems, (for example 
a proof that P = NP), a practical response might be in exploiting the possibility 
of parallel computing, i.e. calculating simultaneously many - or even all - values 
of F(x). This takes less time but uses (dis)proportionally more hardware. 

A remarkable suggestion due to D. Deutsch (see [DeuJ], [Deu]) consists in using 
a quantum superposition of the classical states $|x\rangle$ as the replacement of the union 
of $N$ classical registers, each in one of the initial states $|x\rangle$. To be more precise, 
here is a mathematical model formulated as a definition. 

2.2. Quantum parallel processing: version I. Keeping the notation above, 
assume moreover that $N = 2^n$ and that $F$ is a bijective map (the set of all outputs 
is a permutation of the set of all inputs). 

(i) The quantum space of inputs/outputs is the $2^n$-dimensional complex Hilbert 
space $H_n$ with the orthonormal basis $|x\rangle$, $0 \le x \le N-1$. Vectors $|x\rangle$ are called 
classical states. 

(ii) The quantum version of $F$ is the unique unitary operator $U_F : H_n \to H_n$ 
such that $U_F |x\rangle = |F(x)\rangle$. 

Quantum parallel computing of $F$ is (a physical realization of) a system with the 
state space $H_n$ and the evolution operator $U_F$. 

Naively speaking, if we apply $U_F$ to the initial state which is a superposition 
of all classical states with, say, equal amplitudes, we will get simultaneously all 
classical values of $F$ (i.e. their superposition): 

$$U_F \Bigl( \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |x\rangle \Bigr) = \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |F(x)\rangle. \tag{14}$$
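
The definition of version I can be made concrete with a small numerical sketch. It assumes NumPy; the bijection $x \mapsto 3x + 1 \bmod 8$ and the helper name `permutation_unitary` are hypothetical choices for this example, not part of the text.

```python
import numpy as np

def permutation_unitary(F, N):
    """Matrix of U_F with U_F|x> = |F(x)>, for a bijection F of {0, ..., N-1}."""
    U = np.zeros((N, N))
    for x in range(N):
        U[F(x), x] = 1.0
    return U

n = 3
N = 2 ** n
F = lambda x: (3 * x + 1) % N            # hypothetical bijection (3 is invertible mod 8)
U_F = permutation_unitary(F, N)

uniform = np.full(N, 1 / np.sqrt(N))      # (1/sqrt N) * sum_x |x>
image = U_F @ uniform                     # (1/sqrt N) * sum_x |F(x)>

is_unitary = np.allclose(U_F.T @ U_F, np.eye(N))
basis_image = int(np.argmax(U_F @ np.eye(N)[2]))   # index of U_F|2>
```

Because $F$ is a bijection, the superposition of all $|F(x)\rangle$ with equal amplitudes is the same vector as the superposition of all $|x\rangle$, so `image` coincides with `uniform` here.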



We will now discuss various issues related to this definition, before passing to its 
more realistic modification. 

(A) We put $N = 2^n$ above because we are imagining the respective classical 
system as an $n$-bit register: cf. the discussion of Boolean circuits. Every number 
$0 \le x \le N-1$ is written in the binary notation $x = \sum_i \epsilon_i 2^i$ and is identified with the 
pure (classical) state $|\epsilon_{n-1}, \dots, \epsilon_0\rangle$, where $\epsilon_i = 0$ or $1$ is the state of the $i$-th register. 
The quantum system $H_1$ is called a qubit. We have $H_n = H_1^{\otimes n}$, 
$|\epsilon_{n-1}, \dots, \epsilon_0\rangle = |\epsilon_{n-1}\rangle \otimes \dots \otimes |\epsilon_0\rangle$. 

This conforms to the general principles of quantum mechanics. The Hilbert 
space of the union of systems can be identified with the tensor product of the 
Hilbert spaces of the subsystems. Accordingly, decomposable vectors correspond 
to the states of the compound for which one can say that the individual subsystems 
are in definite states. 

(B) Pure quantum states, strictly speaking, are points of the projective space 
P(H n ) that is, complex lines in H n . Traditionally, one considers instead vectors 
of norm one. This leaves undetermined an overall phase factor $\exp(i\varphi)$. If we have 
two state vectors, individual phase factors have no objective meaning, but their 
quotient, that is the difference of their phases, does have one. This difference 
can be measured by observing effects of interference. This possibility is used for 
implementing efficient quantum algorithms. 

(C) If a quantum system $S$ is isolated, its dynamical evolution is described by the 
unitary operator $U(t) = \exp(iHt)$, where $H$ is the Hamiltonian and $t$ is time. Therefore 
one option for implementing $U_F$ physically is to design a device for which $U_F$ 
would be a fixed time evolution operator. However, this seemingly contradicts 
many deeply rooted notions of the algorithm theory. For example, calculating $F(x)$ 
for different inputs $x$ takes different times, and it would be highly artificial to try 
to equalize them already in the design. 

Instead, one can try to implement $U_F$ as the result of a sequence of brief interactions, 
carefully controlled by a classical computer, of $S$ with the environment (say, laser 
pulses). Mathematically speaking, $U_F$ is represented as a product of some standard 
unitary operators $U_m \dots U_1$, each of which acts only on a small subset (two, three) 
of classical bits. These operators are called quantum gates. 

The complexity of the respective quantum computation is determined by its 
length (the number $m$ of the gates) and by the complexity of each of them. The 
latter point is a subtle one: continuous parameters, e.g. phase shifts, on which $U_i$ 
may depend, make the information content of each $U_i$ potentially infinite and lead 
to a suspicion that a quantum computer will in fact perform an analog computation, 
only implemented in a fancy way. A very interesting discussion in [Ts], Lecture 9, 
convincingly refutes this viewpoint, by displaying those features of quantum com- 
putation which distinguish it from both analog and digital classical information 
processing. This discussion is based on the technique of fault tolerant computing 
using quantum codes for producing continuous variables highly protected from 
external noise. 

(D) From the classical viewpoint, the requirement that $F$ must be a permutation 
looks highly restrictive (for instance, in the search problem $F$ takes only two values). 
Physically, the reason for this requirement is that only such $F$ extend to unitary 
operators ("quantum reversibility"). The standard way out consists of introducing 
two $n$-bit registers instead of one, for keeping the value of the argument as well 
as that of the function. More precisely, if $F(|x\rangle)$ is an arbitrary function, we can 
replace it by the permutation $\tilde{F}(|x, y\rangle) := |x, F(x) \oplus y\rangle$, where $\oplus$ is the Boolean 
(bitwise) sum. This involves no more than a polynomial increase of the classical 
complexity, and the restriction of $\tilde{F}$ to $y = 0$ produces the graph of $F$ which we 
need anyway for the type of problems we are interested in. 
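
The passage from $F$ to the permutation $(x, y) \mapsto (x, F(x) \oplus y)$ can be checked directly on a small example. The following Python sketch uses a hypothetical two-valued $F$ on 3-bit inputs; the name `reversible_extension` is an assumption of this example.

```python
def reversible_extension(F):
    """Return the map (x, y) -> (x, F(x) XOR y) on pairs of n-bit integers."""
    def F_tilde(x, y):
        return x, F(x) ^ y
    return F_tilde

n = 3
N = 2 ** n
F = lambda x: 1 if x == 5 else 0          # hypothetical search-type function, not a bijection
F_tilde = reversible_extension(F)

pairs = [(x, y) for x in range(N) for y in range(N)]
images = [F_tilde(x, y) for x, y in pairs]

is_permutation = sorted(images) == pairs                            # bijection on all pairs
is_involution = all(F_tilde(*F_tilde(x, y)) == (x, y) for x, y in pairs)
graph = [F_tilde(x, 0) for x in range(N)]                           # restriction to y = 0
```

Here `graph` lists the pairs $(x, F(x))$, which is the restriction to $y = 0$ mentioned above, and the extension is even an involution because XOR-ing with $F(x)$ twice restores $y$.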

In fact, in order to process a classical algorithm (sequence of Boolean gates) for 
computing $F$ into the quantum one, we replace each classical gate by the respective 
reversible quantum gate, i.e. by the unitary operator corresponding to it tensored 
with the identity operator. Besides two registers for keeping $|x\rangle$ and $F(|x\rangle)$, this 
trick introduces as well extra qubits in which we are not particularly interested. 
The corresponding space and its content is sometimes referred to as "scratchpad", 
"garbage", etc. Besides ensuring reversibility, additional space and garbage can be 
introduced as well for considering functions $F : \{0, \dots, N-1\} \to \{0, \dots, M-1\}$ 
where $N$, $M$ are not powers of two (then we extend them to the closest power of 
two). For more details, see the next section. 

Notice that the choice of gate array (Boolean circuit) as the classical model 
of computation is essential in the following sense: a quantum routine cannot use 
conditional instructions. Indeed, to implement such an instruction we must observe 
the memory in the midst of calculation, but the observation generally will change 
its current quantum state. 

In the same vein, we must avoid copying instructions, because the classical copying 
operator $|x\rangle \to |x\rangle \otimes |x\rangle$ is not linear. In particular, each output qubit from a 
quantum gate can be used only in one gate at the next step (if several gates are 
used in parallel): cloning is not allowed. 

These examples show that the basics of quantum code writing will have a very 
distinct flavor. 

We now pass to the problems posed by the input/output routines. 

Input, or initialization, in principle can be implemented in the same way as a 
computation: we produce an input state starting e.g. from the classical state $|0\rangle$ 
and applying a sequence of basic unitary operators: see the next section. Output, 
however, involves an additional quantum mechanical notion: that of observation. 

(E) The simplest model of observation of a quantum system with the Hilbert 
space $H$ involves the choice of an orthonormal basis of $H$. Only elements of this 
basis $|\chi_i\rangle$ can appear as the results of observation. If our system is in some state $|\psi\rangle$ 
at the moment of observation, it will be observed in the state $|\chi_i\rangle$ with probability 

$$|\langle \chi_i | \psi \rangle|^2.$$

This means first of all that every quantum computation is inherently probabilistic. 
Observing (a part of) the quantum memory is not exactly the same as "printing 
the output" . We must plan a series of runs of the same quantum program and the 
subsequent classical processing of the observed results, and we can hope only to get 
the desired answer with probability close to one. 

Furthermore, this means that by implementing quantum parallelism simplemind- 
edly as in (14), and then observing the memory as if it were the classical n-bit 
register, we will simply get some value F(x) with probability 1/N. This does not 
use the potential of the quantum parallelism. Therefore we formulate a corrected 
version of this notion, leaving more flexibility and stressing the additional tasks of 
the designer, each of which eventually contributes to the complexity estimate. 

2.3. Quantum parallel processing: version II. To solve efficiently a prob- 
lem involving properties of the graph of a function F , we must design: 

(i) An auxiliary unitary operator U carrying the relevant information about the 
graph of F. 

(ii) A computationally feasible realization of $U$ with the help of standard quantum 
gates. 

(iii) A computationally feasible realization of the input subroutine. 

(iv) A computationally feasible classical algorithm processing the results of many 
runs of quantum computation. 

All of this must be supplemented by quantum error-correcting encoding, which 
we will not address here. In the next section we will discuss some standard quantum 
subroutines. 

3. Selected quantum subroutines 

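3.1. Initialization. Using the same conventions as in (14) and the subsequent 
comments, in particular, the identification $H_n = H_1^{\otimes n}$, we have 

$$\frac{1}{\sqrt{N}} \bigl( |0\rangle + \dots + |N-1\rangle \bigr) = \Bigl( \frac{1}{\sqrt{2}} \bigl( |0\rangle + |1\rangle \bigr) \Bigr)^{\otimes n}. \tag{15}$$

In other words, 

$$\frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |x\rangle = U_1^{(n-1)} \dots U_1^{(0)} |0 \dots 0\rangle, \tag{16}$$

where $U_1 : H_1 \to H_1$ is the unitary operator 

$$|0\rangle \mapsto \frac{1}{\sqrt{2}} \bigl( |0\rangle + |1\rangle \bigr), \qquad |1\rangle \mapsto \frac{1}{\sqrt{2}} \bigl( |0\rangle - |1\rangle \bigr),$$

and $U_1^{(i)} = \mathrm{id} \otimes \dots \otimes U_1 \otimes \dots \otimes \mathrm{id}$ acts only on the $i$-th qubit. 

Thus making the quantum gate $U_1$ act on each memory bit, one can in $n$ steps 
initialize our register in the state which is the superposition of all $2^n$ classical states 
with equal weights. 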
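
The identity (16) is easy to confirm numerically. The sketch below assumes NumPy and builds each $U_1^{(i)}$ as a Kronecker product, with qubit $n-1$ as the leftmost tensor factor in agreement with comment (A); the helper name `U1_on` is hypothetical.

```python
import numpy as np
from functools import reduce

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # the gate U_1 in the basis |0>, |1>
I2 = np.eye(2)

def U1_on(i, n):
    """U_1^{(i)}: U_1 on qubit i, identity elsewhere (qubit n-1 is the leftmost factor)."""
    factors = [H if q == i else I2 for q in range(n - 1, -1, -1)]
    return reduce(np.kron, factors)

n = 4
N = 2 ** n
state = np.zeros(N)
state[0] = 1.0                                  # the classical state |0...0>
for i in range(n):                              # n one-qubit steps
    state = U1_on(i, n) @ state
```

After the loop every one of the $2^n$ amplitudes of `state` equals $1/\sqrt{N}$.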

3.2. Quantum computations of classical functions. Let $B$ be a finite basis 
of classical gates containing one-bit identity and generating all Boolean circuits, and 
$F : \mathbf{F}_2^m \to \mathbf{F}_2^n$ a function. We will describe how to turn a Boolean circuit of length 
$L$ calculating $F$ into another Boolean circuit of comparable length consisting only 
of reversible gates, and calculating a modified function, which however contains all 
information about the graph of $F$. Reversibility means that each step is a bijection 
(actually, an involution) and hence can be extended to a unitary operator, that is, 
a quantum gate. For a gate $f$, define $\tilde{f}(|x, y\rangle) = |x, f(x) + y\rangle$ as in 2.2(D) above. 

3.2.1. Claim. A Boolean circuit $S$ of length $L$ in the basis $B$ can be processed 
into the reversible Boolean circuit $\tilde{S}$ of length $O((L + m + n)^2)$ calculating a 
permutation $H : \mathbf{F}_2^{m+n+L} \to \mathbf{F}_2^{m+n+L}$ with the following property: 

$$H(x, y, 0) = (x, F(x) + y, 0) = (\tilde{F}(x, y), 0).$$

Here $x, y, z$ have sizes $m, n, L$ respectively. 

Proof. We will understand $L$ here as the sum of sizes of the outputs of all 
gates involved in the description of $S$. We first replace in $S$ each gate $f$ by its 
reversible counterpart $\tilde{f}$. This involves inserting extra bits which we put side by 
side into a new register of total length $L$. The resulting subcircuit will calculate 
a permutation $K : \mathbf{F}_2^{m+L} \to \mathbf{F}_2^{m+L}$ such that $K(x, 0) = (F(x), G(x))$ for some 
function $G$ (garbage). 

Now add to the memory one more register of size $n$ keeping the variable $y$. Extend 
$K$ to the permutation $\tilde{K} : \mathbf{F}_2^{m+L+n} \to \mathbf{F}_2^{m+L+n}$ keeping $y$ intact: 
$\tilde{K} : (x, 0, y) \mapsto (F(x), G(x), y)$. Clearly, $\tilde{K}$ is calculated by the same Boolean circuit as $K$, but with 
extended register. 

Extend this circuit by the one adding the contents of the first and the third 
register: $(F(x), G(x), y) \mapsto (F(x), G(x), F(x) + y)$. Finally, build the last extension 
which calculates $\tilde{K}^{-1}$ and consists of reversed gates calculating $\tilde{K}$ in reverse order. 
This clears the middle register (scratchpad) and produces $(x, 0, F(x) + y)$. The 
whole circuit requires $O(L + m + n)$ gates if we allow the application of them to not 
necessarily neighboring bits. Otherwise we must insert gates for local permutations 
which will replace this estimate by $O((L + m + n)^2)$. 
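
The compute, copy, uncompute pattern of this proof can be sketched on classical bits. The example below is a simplified variant, not the construction of the proof verbatim: the input register is kept in place rather than overwritten, the only gates are multiply controlled NOTs (NOT, CNOT, Toffoli), and the function $F(x_0, x_1, x_2) = (x_0 \wedge x_1) \oplus x_2$ together with the bit layout is a hypothetical choice.

```python
from itertools import product

def run(gates, bits):
    """Apply reversible gates; each gate (controls, target) flips target if all controls are 1."""
    bits = list(bits)
    for controls, target in gates:
        if all(bits[c] for c in controls):
            bits[target] ^= 1
    return bits

# Bit layout: 0, 1, 2 hold x; 3 holds y; 4, 5 are scratchpad bits, initially 0.
compute = [((0, 1), 4),    # Toffoli: scratch 4 becomes x0 AND x1
           ((4,), 5),      # CNOT: scratch 5 becomes x0 AND x1
           ((2,), 5)]      # CNOT: scratch 5 becomes F(x) = (x0 AND x1) XOR x2
copy = [((5,), 3)]         # CNOT: y becomes y XOR F(x)
uncompute = compute[::-1]  # the same involutive gates in reverse order clear the scratchpad

circuit = compute + copy + uncompute

all_ok = all(
    run(circuit, [x0, x1, x2, y, 0, 0]) == [x0, x1, x2, y ^ ((x0 & x1) ^ x2), 0, 0]
    for x0, x1, x2, y in product((0, 1), repeat=4)
)
```

The circuit has $2L + n$ gates for $L$ compute steps and an $n$-bit output, in line with the $O(L + m + n)$ count when gates may act on non-neighboring bits.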

3.3. Fast Fourier transform. Finding the least period of a function of one real 
variable can be done by calculating its Fourier transforms and looking at its maxima. 
The same strategy is applied by Shor in his solution of the factorization problem. 
We will show now that the discrete Fourier transform $\Phi_n$ is computationally easy 
(quantum polynomial time). We define $\Phi_n : H_n \to H_n$ by 

$$\Phi_n(|x\rangle) = \frac{1}{\sqrt{N}} \sum_{c=0}^{N-1} |c\rangle \exp(2\pi i c x / N). \tag{17}$$

In fact, it is slightly easier to implement directly the operator 

$$\Phi_n^t(|x\rangle) = \frac{1}{\sqrt{N}} \sum_{c=0}^{N-1} |c^t\rangle \exp(2\pi i c x / N), \tag{18}$$

where $c^t$ is $c$ read from the right to the left. The effects of the bit reversal can be 
then compensated at a later stage without difficulty. 

Let $U_2^{(kj)} : H_n \to H_n$, $k < j$, be the quantum gate which acts on the pair of the 
$k$-th and $j$-th qubits in the following way: it multiplies $|11\rangle$ by $\exp(i\pi / 2^{j-k})$ and 
leaves the remaining classical states $|00\rangle$, $|01\rangle$, $|10\rangle$ intact. 

3.3.1. Lemma. We have 

$$\Phi_n^t = \prod_{k=0}^{n-1} \Bigl( U_1^{(k)} \prod_{j=k+1}^{n-1} U_2^{(kj)} \Bigr). \tag{19}$$



By our rules of the game, (19) has polynomial length in the sense that it involves 
only $O(n^2)$ gates. However, implementation of $U_2^{(kj)}$ requires controlling variable 
phase factors which tend to 1 as $j - k$ grows. Moreover, arbitrary pairs of qubits 
must allow quantum mechanical coupling so that for large $n$ the interaction between 
qubits must be non-local. The contribution of these complications to the 
notion of complexity cannot be estimated without going into the details of physical 
arrangement. Therefore I will add a few words to this effect. 

The implementation of a quantum register suggested in [CZ] consists of a collection 
of ions (charged atoms) in a linear harmonic trap (optical cavity). Two of the electronic 
states of each ion are denoted $|0\rangle$ and $|1\rangle$ and represent a qubit. Laser pulses 
transmitted to the cavity through the optical fibers and controlled by the classical 
computer are used to implement gates and read out. The Coulomb repulsion keeps 
ions apart (spatial selectivity) which allows the preparation of each ion separately 
in any superposition of $|0\rangle$ and $|1\rangle$ by timing the laser pulse properly and preparing 
its phase carefully. The same Coulomb repulsion allows for collective excitations 
of the whole cluster whose quanta are called phonons. Such excitations are produced 
by laser pulses as well under appropriate resonance conditions. The resulting 
resonance selectivity combined with the spatial selectivity implements a controlled 
entanglement of the ions that can be used in order to simulate two and three bit 
gates. For a detailed and lucid mathematical explanation, see [Ts], Lecture 8. 

Another recent suggestion ([GeC]) is to use a single molecule as a quantum regis- 
ter, representing qubits by nuclear spins of individual atoms, and using interactions 
through chemical bonds in order to perform multiple bit logic. The classical tech- 
nique of nuclear magnetic resonance developed since the 1940's, which allows one 
to work with many molecules simultaneously, provides the start up technology for 
this project. 

3.4. Quantum search. All the subroutines described up to now boiled down to 
some identities in the unitary groups involving products of not too many operators 
acting on subspaces of small dimension. They did not involve output subroutines 
and therefore did not "compute" anything in the traditional sense of the word. We 
will now describe the beautiful quantum search algorithm due to L. Grover which 
produces a new identity of this type, but also demonstrates the effect of observation 
and the way one can use quantum entanglement in order to exploit the potential 
of quantum parallelism. 

We will treat only the simplest version. Let $F : \mathbf{F}_2^n \to \{0, 1\}$ be a function 
taking the value 1 at exactly one point $x_0$. We want to compute $x_0$. We assume 
that $F$ is computable in polynomial time, or else that its values are given by an 
oracle. Classical search for $x_0$ requires on the average about $N/2$ evaluations of $F$, 
where $N = 2^n$. 

In the quantum version, we will assume that we have a quantum Boolean circuit 
(or quantum oracle) calculating the unitary operator $H_n \to H_n$ 

$$I_F : |x\rangle \mapsto e^{\pi i F(x)} |x\rangle.$$

In other words, $I_F$ is the reflection inverting the sign of $|x_0\rangle$ and leaving the 
remaining classical states intact. 

Moreover, we put $J = -I_\delta$, where $\delta : \mathbf{F}_2^n \to \{0, 1\}$ takes the value 1 only at 0, 
and $V = U_1^{(n-1)} \dots U_1^{(0)}$, as in (16). 

3.4.1. Claim. (i) The real plane in $H_n$ spanned by the uniform superposition $\xi$ 
of all classical states (15) and by $|x_0\rangle$ is invariant with respect to $T := V J V I_F$. 

(ii) $T$ restricted to this plane is the rotation (from $\xi$ to $|x_0\rangle$) by the angle $\varphi_N$, 
where 

$$\cos \varphi_N = 1 - \frac{2}{N}, \qquad \sin \varphi_N = \frac{2\sqrt{N-1}}{N}.$$

The check is straightforward. 

Now, $\varphi_N$ is close to $\frac{2}{\sqrt{N}}$, and for the initial angle $\varphi$ between $\xi$ and $|x_0\rangle$ we have 

$$\cos \varphi = \frac{1}{\sqrt{N}}.$$

Hence in $[\varphi / \varphi_N] \approx \frac{\pi \sqrt{N}}{4}$ applications of $T$ to $\xi$ we will get the state very close to 
$|x_0\rangle$. Stopping the iteration of $T$ after as many steps and measuring the outcome in 
the basis of classical states, we will obtain $|x_0\rangle$ with probability very close to one. 

One application of $T$ replaces in the quantum search one evaluation of $F$. Thus, 
thanks to quantum parallelism, we achieve a polynomial speed-up in comparison 
with the classical search. The case when $F$ takes value 1 at several points and we 
only want to find one of them, can be treated by an extension of this method. If 
there are $n$ such points, the algorithm requires about $\sqrt{N/n}$ steps, and $n$ need not 
be known a priori: see [BoyBHT]. 
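
The rotation picture of Claim 3.4.1 can be simulated for a small register. The following sketch assumes NumPy and represents $T = V J V I_F$ by dense matrices, which is feasible only for small $n$; the function name `grover_run` and the marked point 37 are hypothetical.

```python
import numpy as np
from functools import reduce

def grover_run(n, x0):
    """Iterate T = V J V I_F on the uniform state; return (P(x0), <xi|T|xi>, number of steps)."""
    N = 2 ** n
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    V = reduce(np.kron, [H] * n)          # U_1 applied to every qubit
    I_F = np.eye(N)
    I_F[x0, x0] = -1                      # inverts the sign of |x0>
    J = -np.eye(N)
    J[0, 0] = 1                           # J = -I_delta: fixes |0>, inverts all other states
    T = V @ J @ V @ I_F
    xi = np.full(N, 1 / np.sqrt(N))       # uniform superposition
    steps = int(np.pi * np.sqrt(N) / 4)
    state = xi.copy()
    for _ in range(steps):
        state = T @ state
    return state[x0] ** 2, float(xi @ T @ xi), steps

probability, cos_angle, steps = grover_run(6, 37)
```

For $n = 6$ this uses 6 applications of $T$, compared with about $N/2 = 32$ classical evaluations on average, and `cos_angle` reproduces $\cos \varphi_N = 1 - 2/N$.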



4. Shor's factoring algorithm 

4.1. Notation. Let M be a number to be factored. We will assume that it is 
odd and is not a power of a prime number. 

Denote by $N$ the size of the basic memory register we will be using (not counting 
scratchpad). Its bit size $n$ will be about twice that of $M$. More precisely, choose 
$M^2 < N = 2^n < 2M^2$. Finally, let $1 < t < M$ be a random parameter with 
$\gcd(t, M) = 1$. This condition can be checked classically in time polynomial in $n$. 

Below we will describe one run of Shor's algorithm, in which t (and of course, 
M, N) is fixed. Generally, polynomially many runs will be required, in which the 
value of t can remain the same or be chosen anew. This is needed in order to gather 
statistics. Shor's algorithm is a probabilistic one, with two sources of randomness 
that must be clearly distinguished. One is built into the classical probabilistic 
reduction of factoring to the finding of the period of a function. Another stems 
from the necessity of observing quantum memory, which, too, produces random 
results. 

More precise estimates than those given here show that a quantum computer 
which can store about $3n$ qubits can find a factor of $M$ in time of order $n^3$ with 
probability close to 1: see [BCDP]. On the other hand, it is widely believed that 
no recursive function of the type $M \mapsto$ a proper factor of $M$ belongs to PF. This 
is why the most popular public key encryption schemes rely upon the difficulty of 
the factoring problem. 

4.2. Classical algorithm. Put 

$$r := \min \{\rho \mid t^\rho \equiv 1 \bmod M\},$$

which is the least period of $F : a \mapsto t^a \bmod M$. 

4.2.1. Claim. If one can efficiently calculate $r$ as a function of $t$, one can find 
a proper divisor of $M$ in time polynomial in $\log_2 M$ with probability $\ge 1 - M^{-m}$ for 
any fixed $m$. 

Assume that for a given $t$ the period $r$ satisfies 

$$r \equiv 0 \bmod 2, \qquad t^{r/2} \not\equiv -1 \bmod M.$$

Then $\gcd(t^{r/2} + 1, M)$ is a proper divisor of $M$. Notice that gcd is computable in 
polynomial time. 

The probability that this condition holds is $\ge 1 - \frac{1}{2^{k-1}}$, where $k$ is the number 
of different odd prime divisors of $M$, hence $\ge \frac{1}{2}$ in our case. Therefore we will find 
a good $t$ with probability $\ge 1 - M^{-m}$ in $O(\log M)$ tries. The longest calculation in 
one try is that of $t^{r/2}$. The usual squaring method takes polynomial time as well. 
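
The classical reduction can be sketched in a few lines of Python. In this example the period $r$ is found by brute force, which merely stands in for the quantum subroutine of 4.3 and is exponential in the bit size of $M$; the function names are hypothetical.

```python
from math import gcd

def multiplicative_order(t, M):
    """Least r > 0 with t^r = 1 mod M, by brute force (placeholder for the quantum step)."""
    r, power = 1, t % M
    while power != 1:
        power = (power * t) % M
        r += 1
    return r

def try_factor(t, M):
    """Return a proper divisor of M obtained from t, or None if this t is not good."""
    if gcd(t, M) != 1:
        return None                  # the text assumes gcd(t, M) = 1
    r = multiplicative_order(t, M)
    if r % 2 != 0:
        return None                  # r must be even
    half = pow(t, r // 2, M)         # t^{r/2} mod M by repeated squaring
    if half == M - 1:
        return None                  # t^{r/2} = -1 mod M: the condition fails
    return gcd(half + 1, M)
```

For instance, with $M = 15$ and $t = 7$ the period is 4, $7^2 \equiv 4 \bmod 15$, and $\gcd(5, 15) = 5$; with $t = 14 \equiv -1$ the second condition fails and another $t$ must be tried.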

4.3. Quantum algorithm calculating r. Here we describe one run of the 
quantum algorithm which purports to compute $r$, given $M$, $N$, $t$. We will use the 
working register that can keep a pair consisting of a variable $0 \le a \le N-1$ and 
the respective value of the function $t^a \bmod M$. One more register will serve as the 
scratchpad needed to compute $|a, t^a \bmod M\rangle$ reversibly. When this calculation is 
completed, the content of the scratchpad will be reversibly erased: cf. 3.2.1. In the 
remaining part of the computation the scratchpad will not be used anymore, so we 
can decouple it and forget about it. 

The quantum computation consists of four steps, three of which were described 
in sec. 3: 

(i) Partial initialization produces from $|0, 0\rangle$ the superposition 

$$\frac{1}{\sqrt{N}} \sum_{a=0}^{N-1} |a, 0\rangle.$$

(ii) Reversible calculation of $F$ processes this state into 

$$\frac{1}{\sqrt{N}} \sum_{a=0}^{N-1} |a, t^a \bmod M\rangle.$$

(iii) Partial Fourier transform then furnishes 

$$\frac{1}{N} \sum_{a=0}^{N-1} \sum_{c=0}^{N-1} \exp(2\pi i a c / N)\, |c, t^a \bmod M\rangle.$$

(iv) The last step is the observation of this state with respect to the system of 
classical states $|c, m \bmod M\rangle$. This step produces some concrete output 

$$|c, t^k \bmod M\rangle \tag{20}$$

with probability 

$$\Bigl| \frac{1}{N} \sum_{a:\, t^a \equiv t^k \bmod M} \exp(2\pi i a c / N) \Bigr|^2. \tag{21}$$
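
The probabilities (21) can be tabulated classically for toy parameters. The sketch below assumes NumPy and uses `numpy.fft.ifft`, whose convention $(1/N) \sum_a x_a \exp(2\pi i a c / N)$ matches the amplitude in (21); it sums over the second register to obtain the distribution of the observed $c$. The function name and the values $M = 15$, $t = 7$, $N = 256$ are hypothetical choices.

```python
import numpy as np

def observation_distribution(M, t, N):
    """Probability of observing each c in 0..N-1, summed over the values t^k mod M."""
    values = np.array([pow(t, a, M) for a in range(N)])
    P = np.zeros(N)
    for v in np.unique(values):
        indicator = (values == v).astype(complex)   # 1 exactly where t^a = v mod M
        amplitude = np.fft.ifft(indicator)          # (1/N) * sum_a indicator[a] * exp(2 pi i a c / N)
        P += np.abs(amplitude) ** 2
    return P

P = observation_distribution(15, 7, 256)
peaks = [c for c in range(256) if P[c] > 1e-9]
```

In this toy case the period $r = 4$ divides $N$, so the distribution is concentrated on the multiples of $N/r = 64$, each observed with probability $1/4$. When $r$ does not divide $N$ the peaks are only approximate, which is why the continued fraction step below is needed.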



The remaining part of the run is assigned to the classical computer and consists of 
the following steps. 

(A) Find the best approximation (in lowest terms) to $\frac{c}{N}$ with denominator 
$r' < M < \sqrt{N}$: 

$$\Bigl| \frac{c}{N} - \frac{d'}{r'} \Bigr| < \frac{1}{2N}. \tag{22}$$

As we will see below, we may hope that $r'$ will coincide with $r$ in at least one 
run among at most polynomially many. Hence we try $r'$ in the role of $r$ right away: 

(B) If $r' \equiv 0 \bmod 2$, calculate $\gcd(t^{r'/2} \pm 1, M)$. 

If $r'$ is odd, or if $r'$ is even, but we did not get a proper divisor of $M$, repeat the 
run $O(\log \log M)$ times with the same $t$. In case of failure, change $t$ and start a new 
run. 
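
Step (A) is a continued fraction computation, and in Python it can be sketched with the standard library method `Fraction.limit_denominator`, which returns the closest fraction with a bounded denominator. The function name `candidate_period` and the sample values of $c$ are hypothetical.

```python
from fractions import Fraction

def candidate_period(c, N, M):
    """Denominator r' of the closest fraction d'/r' to c/N with r' < M, in lowest terms."""
    return Fraction(c, N).limit_denominator(M - 1).denominator

# M = 15, N = 256, t = 7 has period r = 4
r_from_192 = candidate_period(192, 256, 15)   # 192/256 = 3/4
r_from_128 = candidate_period(128, 256, 15)   # 128/256 = 1/2

# M = 21, N = 512, t = 2 has period r = 6; c = 85 is the integer nearest to 512/6
r_from_85 = candidate_period(85, 512, 21)
```

The value $c = 128$ illustrates a run in which $r' = 2$ is only a proper divisor of $r = 4$, so that the run has to be repeated as described above.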

4.3.1. Justification. We will now show that, given $t$, from the observed values 
of $|c, t^k \bmod M\rangle$ in $O(\log \log M)$ runs we can find the correct value of $r$ with 
probability close to 1. 

Let us call the observed value of $c$ good, if 

$$\exists\, l \in \Bigl[ -\frac{r}{2}, \frac{r}{2} \Bigr], \quad rc \equiv l \bmod N.$$

In this case there exists such $d$ that 

$$-\frac{r}{2} \le rc - dN = l \le \frac{r}{2},$$

so that 

$$\Bigl| \frac{c}{N} - \frac{d}{r} \Bigr| \le \frac{1}{2N}.$$

Hence if $c$ is good, then $r'$ found from (22) in fact divides $r$. 
Now call $c$ very good if $r' = r$. 

Estimating the exponential sum (21), we can easily check that the probability of 
observing a good $c$ is $\ge \frac{1}{3r^2}$. On the other hand, there are $r\varphi(r)$ states $|c, t^k \bmod M\rangle$ 
with very good $c$. Thus to find a very good $c$ with high probability, $O(r^2 \log r)$ runs 
will suffice. 

5. Kolmogorov complexity and growth of recursive functions 

Consider general functions $f : \mathbf{N} \to \mathbf{N}$. Computability theory uses several 
growth scales for such functions, of which two are most useful: $f$ may be majorized 
by some recursive function (e.g. when it is itself recursive), or by a polynomial 
(e.g. when it is computable in polynomial time). Linear growth does not seem 
particularly relevant in this context. However, this impression is quite misleading, 
at least if one allows re-ordering $\mathbf{N}$. In fact, we have: 

5.1. Claim. There exists a permutation $K : \mathbf{N} \to \mathbf{N}$ such that for any partially 
recursive function $f : \mathbf{N} \to \mathbf{N}$ there exists a constant $c$ with the property 

$$K \circ f \circ K^{-1}(n) \le cn \quad \text{for all } n \in K(D(f)). \tag{23}$$

Moreover, $K$ is bounded by a linear function, but $K^{-1}$ is not bounded by any recursive 
function. 

Proof. We will use the Kolmogorov complexity measure. For a recursive function 
$u : \mathbf{N} \to \mathbf{N}$, $x \in \mathbf{N}$, put $C_u(x) := \min \{k \mid u(k) = x\}$, or $\infty$ if such $k$ does not 
exist. Call such a function $u$ optimal if, for any other recursive function $v$, there 
exists a constant $c_{u,v}$ such that $C_u(x) \le c_{u,v} C_v(x)$ for all $x$. Optimal functions do 
exist (see e.g. [Ma1], Theorem VI.9.2); in particular, they take all positive integer 
values (however they certainly are not everywhere defined). Fix one such $u$ and 
call $C_u(x)$ the (exponential) complexity of $x$. By definition, $K = K_u$ rearranges $\mathbf{N}$ 
in the order of increasing complexity. In other words, 

$$K(x) := 1 + \mathrm{card}\, \{y \mid C_u(y) < C_u(x)\}. \tag{24}$$

We first show that 

$$K(x) = \exp(O(1))\, C_u(x). \tag{25}$$

Since $C_u$ takes each value at most once, it follows from (24) that $K(n) \le C_u(n)$. 
In order to show that $C_u(x) \le cK(x)$ for some $c$ it suffices to check that 

$$\mathrm{card}\, \{k \le N \mid \exists\, x,\ C_u(x) = k\} \ge bN$$

with some $b > 0$. In fact, at least half of the numbers $x \le N$ have the complexity 
which is no less than $x/2$. 

Now, VI.9.7(b) in [Ma1] implies that, for any recursive function $f$ and all 
$x \in D(f)$, we have $C_u(f(x)) \le \mathrm{const} \cdot C_u(x)$. Since $C_u(x)$ and $K(x)$ have the same order 
of growth up to a bounded factor, our claim follows. 

5.2. Corollary. Denote by $S_\infty^{rec}$ the group of recursive permutations of $\mathbf{N}$. 
Then $K S_\infty^{rec} K^{-1}$ is a subgroup of permutations of no more than linear growth. 

Actually, appealing to Proposition VI.9.6 of [Ma1], one can considerably 
strengthen this result. For example, let $\sigma$ be a recursive permutation, 
$\sigma^K = K \sigma K^{-1}$. Then $\sigma^K(x) \le cx$ so that $(\sigma^K)^n(x) \le c^n x$ for $n > 0$. But actually the last 
inequality can be replaced by 

$$(\sigma^K)^n(x) \le c'n$$

for a fixed $x$ and variable $n$. With both $x$ and $n$ variable one gets the estimate 
$O(xn \log(xn))$. 

In the same way as finite permutations appear in the quantum versions of 
Boolean circuits, infinite (computable) permutations are natural for treating quantum 
Turing machines ([Deu]) and our normal computation models. In fact, if one 
assumes that the transition function $s$ is a permutation, and then extends it to 
the unitary operator $U_s$ in the infinite-dimensional Hilbert space, one might be 
interested in studying the spectral properties of such operators. But the latter 
depend only on the conjugacy class. Perhaps the universal conjugation $U_K$ might 
be a useful theoretical tool in this context. In the purely classical situation, (23) 
may play a role in studying the limiting behavior of polynomial time algorithms, 
as suggested in [Fr1] and [Fr2]. 

Finally, I would like to comment upon the hidden role of Kolmogorov complex- 
ity in the real life of classical computing. The point is that in a sense (which is 
difficult to formalize), we are interested only in the calculation of sufficiently nice 
functions, because a random Boolean function will have (super) exponential com- 
plexity anyway. A nice function, at the very least, has a short description and, 
therefore, a small Kolmogorov complexity. Thus, dealing with practical problems, 
we actually work not with small numbers, graphs, circuits, . . . , but rather with an 
initial segment of the respective constructive world reordered with the help of K. 
We systematically replace a large object by its short description, and then try to 
overcome the computational difficulties generated by this replacement. 

Appendix 

The following text is a contribution to the prehistory of quantum computing. It 
is the translation from Russian of the last three paragraphs of the Introduction to 
[Ma2] (1980). For this reference I am grateful to A. Kitaev [Ki]. 

" Perhaps, for better understanding of this phenomenon [DNA replication], we 
need a mathematical theory of quantum automata. Such a theory would provide us 
with mathematical models of deterministic processes with quite unusual properties. 
One reason for this is that the quantum state space has far greater capacity than 
the classical one: for a classical system with $N$ states, its quantum version allowing 
superposition accommodates $c^N$ states. When we join two classical systems, 
their numbers of states $N_1$ and $N_2$ are multiplied, and in the quantum case we get 
exponential growth $c^{N_1 N_2}$. 

These crude estimates show that the quantum behavior of the system might 
be much more complex than its classical simulation. In particular, since there is 
no unique decomposition of a quantum system into its constituent parts, a state 
of the quantum automaton can be considered in many ways as a state of various 
virtual classical automata. Cf. the following instructive comment at the end of the 
article [Po]: 'The quantum-mechanical computation of one molecule of methane 
requires $10^{42}$ grid points. Assuming that at each point we have to perform only 
10 elementary operations, and that the computation is performed at the extremely 
low temperature $T = 3 \cdot 10^{-3}$ K, we would still have to use all the energy produced 
on Earth during the last century.' 

The first difficulty we must overcome is the choice of the correct balance between 
the mathematical and the physical principles. The quantum automaton has to be an 
abstract one: its mathematical model must appeal only to the general principles of 
quantum physics, without prescribing a physical implementation. Then the model 
of evolution is the unitary rotation in a finite dimensional Hilbert space, and the 
decomposition of the system into its virtual parts corresponds to the tensor product 
decomposition of the state space. Somewhere in this picture we must accommodate 
interaction, which is described by density matrices and probabilities." 